53 research outputs found

Comparative Genomics Identifies Candidate Genes for Infectious Salmon Anemia (ISA) Resistance in Atlantic Salmon (Salmo salar)

Author: Boroevich Keith A.
Davidson William S.
Koop Ben F.
Li Jieying
Publication venue: Springer-Verlag
Publication date: 01/01/2010

Infectious salmon anemia (ISA) has been described as the hoof and mouth disease of salmon farming. ISA is caused by a lethal and highly communicable virus, which can have a major impact on salmon aquaculture, as demonstrated by an outbreak in Chile in 2007. A quantitative trait locus (QTL) for ISA resistance has been mapped to three microsatellite markers on linkage group (LG) 8 (Chr 15) on the Atlantic salmon genetic map. We identified bacterial artificial chromosome (BAC) clones and three fingerprint contigs from the Atlantic salmon physical map that contain these markers. We made use of the extensive BAC end sequence database to extend these contigs by chromosome walking and identified two additional markers in this region. The BAC end sequences were used to search for conserved synteny between this segment of LG8 and the fish genomes that have been sequenced. An examination of the genes in the syntenic segments of the tetraodon and medaka genomes identified candidates for association with ISA resistance in Atlantic salmon based on differential expression profiles from ISA challenges or on the putative biological functions of the proteins they encode. One gene in particular, HIV-EP2/MBP-2, caught our attention as it may influence the expression of several genes that have been implicated in the response to infection by infectious salmon anemia virus (ISAV). Therefore, we suggest that HIV-EP2/MBP-2 is a very strong candidate for the gene associated with the ISAV resistance QTL in Atlantic salmon and is worthy of further study.

Springer - Publisher Connector

PubMed Central

DeepInsight: a methodology to transform a non - image data to an image for convolution neural network architecture

Author: Boroevich Keith A.
Sharma Alokanand
Shigemizu Daichi
Tsunoda Tatsuhiko
Vans Edwin
Publication venue: Springer Science and Business Media LLC
Publication date: 06/08/2019

It is critical, but difficult, to catch the small variation in genomic or other kinds of data that differentiates phenotypes or categories. A plethora of data is available, but the information from its genes or elements is spread arbitrarily, making it challenging to extract relevant details for identification. However, an arrangement of similar genes into clusters makes these differences more accessible and allows for more robust identification of hidden mechanisms (e.g. pathways) than dealing with elements individually. Here we propose DeepInsight, which converts non-image samples into a well-organized image form. Thereby, the power of convolution neural network (CNN), including GPU utilization, can be realized for non-image samples. Furthermore, DeepInsight enables feature extraction through the application of CNN for non-image samples to seize imperative information, and has shown promising results. To our knowledge, this is the first work to apply CNN simultaneously on different kinds of non-image datasets: RNA-seq, vowels, text, and artificial.

University of the South Pacific Electronic Research Repository

Prediction Models of Breast Cancer Outcome

Author: Boroevich Keith A
Iwase Takuji
Katagiri Toyomasa
Miya Fuyuki
Shigemizu Daichi
Suzuki Yasuyo
Tsunoda Tatsuhiko
Yoshimoto Masataka
Zembutsu Hitoshi
Publication venue: Wiley
Publication date: 27/10/2020

The goal of this study is to establish a method for predicting overall survival (OS) and disease‐free survival (DFS) in breast cancer patients after surgical operation. The gene expression profiles of cancer tissues from the patients, who underwent complete surgical resection of breast cancer and were subsequently monitored for postoperative survival, were analyzed using cDNA microarrays. We detected seven and three probes/genes associated with the postoperative OS and DFS, respectively, from our discovery cohort data. By incorporating these genes associated with the postoperative survival into MammaPrint genes, often used to predict prognosis of patients with early‐stage breast cancer, we constructed postoperative OS and DFS prediction models from the discovery cohort data using a Cox proportional hazard model. The predictive ability of the models was evaluated in another independent cohort using Kaplan–Meier (KM) curves and the area under the receiver operating characteristic curve (AUC). The KM curves showed a statistically significant difference between the predicted high‐ and low‐risk groups in both OS (log‐rank trend test P = 0.0033) and DFS (log‐rank trend test P = 0.00030). The models also achieved high AUC scores of 0.71 in OS and of 0.60 in DFS. Furthermore, our models had improved KM curves when compared to the models using MammaPrint genes (OS: P = 0.0058, DFS: P = 0.00054). Similar results were observed when our model was tested in publicly available datasets. These observations indicate that there is still room for improvement in the current methods of predicting postoperative OS and DFS in breast cancer.

Tokushima University Institutional Repository

Assessing the feasibility of GS FLX Pyrosequencing for sequencing the Atlantic salmon genome

Author: Boroevich Keith A
Bouffard Pascal
Chow William
Davidson William S
Desany Brian A
Harkins Timothy T
Jarvie Thomas P
Knight James R
Koop Ben F
Levenkova Natasha
Lubieniecki Krzysztof P
Quinn Nicole L
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: With a whole genome duplication event and wealth of biological data, salmonids are excellent model organisms for studying evolutionary processes, fates of duplicated genes and genetic and physiological processes associated with complex behavioral phenotypes. It is surprising, therefore, that no salmonid genome has been sequenced. Atlantic salmon (Salmo salar) is a good representative salmonid for sequencing given its importance in aquaculture and the genomic resources available. However, the size and complexity of the genome combined with the lack of a sequenced reference genome from a closely related fish make assembly challenging. Given the cost and time limitations of Sanger sequencing as well as recent improvements to next generation sequencing technologies, we examined the feasibility of using the Genome Sequencer (GS) FLX pyrosequencing system to obtain the sequence of a salmonid genome. Eight pooled BACs belonging to a minimum tiling path covering ~1 Mb of the Atlantic salmon genome were sequenced by GS FLX shotgun and Long Paired End sequencing and compared with a ninth BAC sequenced by Sanger sequencing of a shotgun library. Results: An initial assembly using only GS FLX shotgun sequences (average read length 248.5 bp) with ~30× coverage allowed gene identification, but was incomplete even when 126 Sanger-generated BAC-end sequences (~0.09× coverage) were incorporated. The addition of paired end sequencing reads (additional ~26× coverage) produced a final assembly comprising 175 contigs assembled into four scaffolds with 171 gaps. Sanger sequencing of the ninth BAC (~10.5× coverage) produced nine contigs and two scaffolds. The number of scaffolds produced by the GS FLX assembly was comparable to Sanger-generated sequencing; however, the number of gaps was much higher in the GS FLX assembly. Conclusion: These results represent the first use of GS FLX paired end reads for de novo sequence assembly. Our data demonstrated that this improved the GS FLX assemblies; however, with respect to de novo sequencing of complex genomes, the GS FLX technology is limited to gene mining and establishing a set of ordered sequence contigs. Currently, for a salmonid reference sequence, it appears that a substantial portion of sequencing should be done using Sanger technology.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Simon Fraser University Institutional Repository

Stepwise iterative maximum likelihood clustering approach

Author: A Ben-Hur
A Cd
A Sharma
AK Jain
Alok Sharma
AP Dempster
B Mirkin
C Chen
D Defays
D Fisher
Daichi Shigemizu
E Elhamifar
E-J Yeoh
EF Lock
EK Latch
ER Berndt
I Misztal
J Felsenstein
J Khan
J Lee
J-H Chiang
JS Liu
JS Long
K Wang
Keith A. Boroevich
M Ramoni
MD Wilkerson
Michiaki Kubo
MM Rahman
Q Mo
R Fletcher
R Sibson
RI Jennrich
S Farrell
S Jun
S Monti
S-J Horng
SA Armstrong
T Denoeux
T Hastie
Tatsuhiko Tsunoda
WC Davidon
X Zheng
Y Yamaguchi-Kabata
Yoichiro Kamatani
Yosvany López
Publication venue: Springer Science and Business Media LLC

Crossref

Genomic organization and evolution of the Atlantic salmon hemoglobin repertoire

Background: The genomes of salmonids are considered pseudo-tetraploid undergoing reversion to a stable diploid state. Given the genome duplication and extensive biological data available for salmonids, they are excellent model organisms for studying comparative genomics, evolutionary processes, fates of duplicated genes and the genetic and physiological processes associated with complex behavioral phenotypes. The evolution of the tetrapod hemoglobin genes is well studied; however, little is known about the genomic organization and evolution of teleost hemoglobin genes, particularly those of salmonids. The Atlantic salmon serves as a representative salmonid species for genomics studies. Given the well documented role of hemoglobin in adaptation to varied environmental conditions as well as its use as a model protein for evolutionary analyses, an understanding of the genomic structure and organization of the Atlantic salmon α and β hemoglobin genes is of great interest. Results: We identified four bacterial artificial chromosomes (BACs) comprising two hemoglobin gene clusters spanning the entire α and β hemoglobin gene repertoire of the Atlantic salmon genome. Their chromosomal locations were established using fluorescence in situ hybridization (FISH) analysis and linkage mapping, demonstrating that the two clusters are located on separate chromosomes. The BACs were sequenced and assembled into scaffolds, which were annotated for putatively functional and pseudogenized hemoglobin-like genes. This revealed that the tail-to-tail organization and alternating pattern of the α and β hemoglobin genes are well conserved in both clusters, as well as that the Atlantic salmon genome houses substantially more hemoglobin genes, including non-Bohr β globin genes, than the genomes of other teleosts that have been sequenced. Conclusions: We suggest that the most parsimonious evolutionary path leading to the present organization of the Atlantic salmon hemoglobin genes involves the loss of a single hemoglobin gene cluster after the whole genome duplication (WGD) at the base of the teleost radiation but prior to the salmonid-specific WGD, which then produced the duplicated copies seen today. We also propose that the relatively high number of hemoglobin genes as well as the presence of non-Bohr β hemoglobin genes may be due to the dynamic life history of salmon and the diverse environmental conditions that the species encounters. Data deposition: BACs S0155C07 and S0079J05 (fps135): GenBank GQ898924; BACs S0055H05 and S0014B03 (fps1046): GenBank GQ898925.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Simon Fraser University Institutional Repository

Pathway and network analysis of more than 2500 whole cancer genomes.

The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types, motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes had variable proportions of coding and non-coding mutations, with chromatin remodeling and proliferation pathways altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, are altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit gene expression signatures similar to those of samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.

Repository for Publications and Research Data

DSpace@MIT

Lund University Publications

Ghent University Academic Bibliography

Publikationer från Uppsala Universitet

eScholarship - University of California

Digitala Vetenskapliga Arkivet - Academic Archive On-line

UPF Digital Repository

Apollo (Cambridge)

Bern Open Repository and Information System (BORIS)

Online Research Database In Technology

University of Queensland eSpace

Cancer LncRNA Census reveals evidence for deep functional conservation of long noncoding RNAs in tumorigenesis.

Author: Abascal Federico
Amin Samirkumar B.
Bader Gary D.
Barenboim Jonathan
Beroukhim Rameen
Bertl Johanna
Boroevich Keith A.
Brunak Soren
Campbell Peter J.
Carlevaro-Fita Joana
Chakravarty Dimple
Chan Calvin Wing Yiu
Chen Ken
Choi Jung Kyoon
Deu-Pons Jordi
Dhingra Priyanka
Diamanti Klev
Feuerbach Lars
Fink J. Lynn
Fonseca Nuno A.
Frigola Joan
Gambacorti-Passerini Carlo
Garsed Dale W.
Gerstein Mark
Getz Gad
Gonzalez-Perez Abel
Guo Qianyun
Gut Ivo G.
Haan David
Hamilton Mark P.
Haradhvala Nicholas J.
Harmanci Arif O.
Helmy Mohamed
Herrmann Carl
Hess Julian M.
Hobolth Asger
Hodzic Ermin
Hong Chen
Hornshoj Henrik
Isaev Keren
Izarzugaza Jose M. G.
Johnson Rory
Johnson Todd A.
Juul Malene
Juul Randi Istrup
Kahles Andre
Kahraman Abdullah
Kellis Manolis
Khurana Ekta
Kim Jaegil
Kim Jong K.
Kim Youngwook
Komorowski Jan
Korbel Jan O.
Kumar Sushant
Lanzos Andres
Larsson Erik
Lawrence Michael S.
Lee Donghoon
Lehmann Kjong-Van
Li Shantao
Li Xiaotong
Lin Ziao
Liu Eric Minwei
Lochovsky Lucas
Lou Shaoke
Madsen Tobias
Marchal Kathleen
Martincorena Inigo
Martinez-Fundichely Alexander
Maruvka Yosef E.
Mas-Ponte David
McGillivray Patrick D.
Meyerson William
Muinos Ferran
Mularoni Loris
Nakagawa Hidewaki
Nielsen Morten Muhlig
Paczkowska Marta
Park Keunchil
Park Kiejung
Pedersen Jakob Skou
Pich Oriol
Pons Tirso
Pulido-Tamayo Sergio
Raphael Benjamin J.
Reimand Juri
Reyes-Salazar Iker
Reyna Matthew A.
Rheinbay Esther
Rubin Mark A.
Rubio-Perez Carlota
Sabarinathan Radhakrishnan
Sahinalp S. Cenk
Saksena Gordon
Salichos Leonidas
Sander Chris
Schumacher Steven E.
Shackleton Mark
Shapira Ofer
Shen Ciyue
Shrestha Raunak
Shuai Shimin
Sidiropoulos Nikos
Sieverling Lina
Sinnott-Armstrong Nasa
Stein Lincoln D.
Stuart Joshua M.
Tamborero David
Tiao Grace
Tsunoda Tatsuhiko
Umer Husen M.
Uuskula-Reimand Liis
Valencia Alfonso
Vazquez Miguel
Verbeke Lieven P. C.
von Mering Christian
Wadelius Claes
Wadi Lina
Wang Jiayin
Warrell Jonathan
Waszak Sebastian M.
Weischenfeldt Joachim
Wheeler David A.
Wu Guanming
Yu Jun
Zhang Jing
Zhang Xuanping
Zhang Yan
Zhao Zhongming
Zou Lihua
Publication venue: Commun Biol
Publication date: 01/01/2020

Long non-coding RNAs (lncRNAs) are a growing focus of cancer genomics studies, creating the need for a resource of lncRNAs with validated cancer roles. Furthermore, it remains debated whether mutated lncRNAs can drive tumorigenesis, and whether such functions could be conserved during evolution. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we introduce the Cancer LncRNA Census (CLC), a compilation of 122 GENCODE lncRNAs with causal roles in cancer phenotypes. In contrast to existing databases, CLC requires strong functional or genetic evidence. CLC genes are enriched amongst driver genes predicted from somatic mutations, and display characteristic genomic features. Strikingly, CLC genes are enriched for driver mutations from unbiased, genome-wide transposon-mutagenesis screens in mice. We identified 10 tumour-causing mutations in orthologues of 8 lncRNAs, including LINC-PINT and NEAT1, but not MALAT1. Thus CLC represents a dataset of high-confidence cancer lncRNAs. Mutagenesis maps are a novel means for identifying deeply-conserved roles of lncRNAs in tumorigenesis.

Repository for Publications and Research Data

DSpace@MIT

Lund University Publications

Publikationer från Uppsala Universitet

Ghent University Academic Bibliography

Digitala Vetenskapliga Arkivet - Academic Archive On-line

UPF Digital Repository

Apollo (Cambridge)

Bern Open Repository and Information System (BORIS)

Analyses of non-coding somatic drivers in 2,658 cancer whole genomes.

Author: Abascal Federico
Akdemir Kadir C.
Alvarez Eva G.
Amin Samirkumar B.
Bader Gary D.
Baez-Ortega Adrian
Bandopadhayay Pratiti
Barenboim Jonathan
Beroukhim Rameen
Bertl Johanna
Boroevich Keith A.
Boutros Paul C.
Bowtell David D. L.
Brors Benedikt
Brunak Soren
Burns Kathleen H.
Busanovich John
Campbell Peter J.
Carlevaro-Fita Joana
Chakravarty Dimple
Chan Calvin Wing Yiu
Chan Kin
Chen Ken
Choi Jung Kyoon
Cortes-Ciriano Isidro
Craft David
Deu-Pons Jordi
Dhingra Priyanka
Diamanti Klev
Dueso-Barroso Ana
Dunford Andrew J.
Edwards Paul A.
Estivill Xavier
Etemadmoghadam Dariush
Feuerbach Lars
Fink J. Lynn
Fonseca Nuno A.
Frenkel-Morgenstern Milana
Frigola Joan
Gambacorti-Passerini Carlo
Garsed Dale W.
Gerstein Mark
Getz Gad
Gonzalez-Perez Abel
Gordenin Dmitry A.
Guo Qianyun
Gut Ivo G.
Haan David
Haber James E.
Hamilton Mark P.
Haradhvala Nicholas J.
Harmanci Arif O.
Helmy Mohamed
Herrmann Carl
Hess Julian M.
Hobolth Asger
Hodzic Ermin
Hong Chen
Hornshoj Henrik
Hutter Barbara
Imielinski Marcin
Isaev Keren
Izarzugaza Jose M. G.
Johnson Rory
Johnson Todd A.
Jones David T. W.
Ju Young Seok
Juul Malene
Juul Randi Istrup
Kahles Andre
Kahraman Abdullah
Kazanov Marat D.
Kellis Manolis
Khurana Ekta
Kim Jaegil
Kim Jong K.
Kim Youngwook
Klimczak Leszek J.
Koh Youngil
Komorowski Jan
Korbel Jan O.
Kumar Kiran
Kumar Sushant
Lanzos Andres
Larsson Erik
Lawrence Michael S.
Lee Donghoon
Lee Eunjung Alice
Lee Jake June-Koo
Lehmann Kjong-Van
Li Shantao
Li Xiaotong
Li Yilong
Lin Ziao
Liu Eric Minwei
Lochovsky Lucas
Lopez-Bigas Nuria
Lou Shaoke
Lynch Andy G.
Macintyre Geoff
Madsen Tobias
Marchal Kathleen
Markowetz Florian
Martincorena Inigo
Martinez-Fundichely Alexander
Maruvka Yosef E.
McGillivray Patrick D.
Meyerson Matthew
Meyerson William
Miyano Satoru
Muinos Ferran
Mularoni Loris
Nakagawa Hidewaki
Navarro Fabio C. P.
Nielsen Morten Muhlig
Ossowski Stephan
Paczkowska Marta
Park Keunchil
Park Kiejung
Park Peter J.
Pearson John V.
Pedersen Jakob Skou
Pich Oriol
Pons Tirso
Puiggros Montserrat
Pulido-Tamayo Sergio
Raphael Benjamin J.
Reimand Juri
Reyes-Salazar Iker
Reyna Matthew A.
Rheinbay Esther
Rippe Karsten
Roberts Nicola D.
Roberts Steven A.
Rodriguez-Martin Bernardo
Rubin Mark A.
Rubio-Perez Carlota
Sabarinathan Radhakrishnan
Sahinalp S. Cenk
Saksena Gordon
Salichos Leonidas
Sander Chris
Schumacher Steven E.
Scully Ralph
Shackleton Mark
Shapira Ofer
Shen Ciyue
Shrestha Raunak
Shuai Shimin
Sidiropoulos Nikos
Sieverling Lina
Sinnott-Armstrong Nasa
Stein Lincoln D.
Stewart Chip
Stuart Joshua M.
Tamborero David
Tiao Grace
Torrents David
Tsunoda Tatsuhiko
Tubio Jose M. C.
Umer Husen Muhammad
Uuskula-Reimand Liis
Valencia Alfonso
Vazquez Miguel
Verbeke Lieven P. C.
Villasante Izar
von Mering Christian
Waddell Nicola
Wadelius Claes
Wadi Lina
Wala Jeremiah A.
Wang Jiayin
Warrell Jonathan
Waszak Sebastian M.
Weischenfeldt Joachim
Wheeler David A.
Wu Guanming
Yang Lixing
Yao Xiaotong
Yoon Sung-Soo
Yu Jun
Zamora Jorge
Zhang Cheng-Zhong
Zhang Jing
Zhang Xuanping
Zhang Yan
Zhao Zhongming
Zou Lihua
Publication venue: Nature
Publication date: 01/01/2020

The discovery of drivers of cancer has traditionally focused on protein-coding genes [1-4]. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [5] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers [6,7], raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.

Publikationsserver der Universität Tübingen

Digitala Vetenskapliga Arkivet - Academic Archive On-line

UPF Digital Repository

Repository for Publications and Research Data

DSpace@MIT

Lund University Publications

Ghent University Academic Bibliography

Publikationer från Uppsala Universitet

UCL Discovery

Copenhagen University Research Information System

eScholarship - University of California

Apollo (Cambridge)